(57) A method and apparatus for identifying links in an electronic
document provides an electronic file as a data structure having
components and having base links that define the structural
relationship between the components, traverses the data structure
using the base links, and produces a virtual link between two
components by recognizing a characteristic shared by the components.
The virtual link is identified when needed at run-time. A function may
be performed using the components as components are identified.
Description

BACKGROUND OF THE INVENTION 

[0001] This invention identifies components in an electronic file.

[0002] An electronic document typically has information content, such
as text, graphics, and tables, and formatting content that directs how
to display the information content. Document publishing systems, which
include word processing systems and desktop publishing systems, may
store electronic documents as hierarchical data structures. Such
structures represent the information content and formatting content as
nodes connected to one another in an ordered arrangement.
[0003] A system traverses a data structure to gather data about the
structure and to perform operations using that data. To traverse a
hierarchical structure, the system follows a set of links from one node
to another.
[0004] The links between the nodes are sometimes described in terms of
family relationships. A node attached to and above another node in the
hierarchical structure is referred to as the parent of the latter node.
A node attached to and below another node in the hierarchical structure
is referred to as the child of the latter node. Nodes having the same
parent are referred to as siblings.

[0005] In addition to specifying nodal relationships in terms of
familial links, systems may identify the relationship between nodes in
terms of next and previous links. Next and previous links ignore the
familial relationships and deal with incremental positions of nodes
within a document.

[0006] Familial links, and next and previous links will be referred to
as "base links." The base links connect every node in the structure and
define the structure's hierarchy. A system uses the base links to
traverse the structure and discover the structure's organization. The
structure's organization determines the order of processing for certain
types of operations. For example, a spell checker may use the base
links to examine each word in an electronic document from the beginning
to the end of the document. The structure's organization also
determines which nodes share behavior characteristics with other nodes.
For example, a node may define paragraph characteristics that are
inherited and refined by descendant nodes.
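
By way of illustration only, the base links and the spell-checker style traversal described above might be sketched in Python as follows. The names `Node`, `next_node`, and `words_in_order` are assumptions of this sketch and do not appear in the specification. For brevity the "next" link is derived from the familial links, whereas the text treats next and previous links as stored base links.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """One node of the document tree (hypothetical name and fields)."""
    kind: str
    text: str = ""
    parent: Optional["Node"] = None                       # familial link
    children: List["Node"] = field(default_factory=list)  # familial links

    def add(self, child: "Node") -> "Node":
        """Attach a child node and set its parent link."""
        child.parent = self
        self.children.append(child)
        return child


def next_node(node: Node) -> Optional[Node]:
    """Return the node that follows `node` in document order, or None.

    Assumption: the 'next' link is computed from the familial links here.
    """
    if node.children:
        return node.children[0]
    while node.parent is not None:
        siblings = node.parent.children
        i = siblings.index(node)
        if i + 1 < len(siblings):
            return siblings[i + 1]
        node = node.parent
    return None


def words_in_order(root: Node) -> Iterator[str]:
    """Yield every word from the beginning to the end of the document."""
    node: Optional[Node] = root
    while node is not None:
        yield from node.text.split()
        node = next_node(node)


# A small, invented document used only to exercise the sketch.
doc = Node("document")
intro = doc.add(Node("section", "Intro"))
intro.add(Node("paragraph", "first paragraph"))
intro.add(Node("paragraph", "second one"))
doc.add(Node("section", "End"))

print(list(words_in_order(doc)))
```

A spell checker built on such a traversal would test each yielded word in turn; the order of the words follows from the structure's organization alone.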

[0007] Other than a set of base links that connect all nodes in a
hierarchical data structure, a system can have sets of direct links to
connect nodes in the same or in different branches of a hierarchical
data structure. Direct links locate nodes that may have an effect upon
each other under a certain set of circumstances. For example, if an
author inserted a numbered section heading into a document, the system
could use one set of direct links between numbered section heading
nodes to find and renumber all subsequent section headings. Direct
links are also useful in other situations, for example to identify
components of a detailed outline, identify components of a brief
outline, locate all index markers, and locate all bibliographic
references.

SUMMARY OF THE INVENTION

[0008] In one aspect, the invention is directed to a
computer-implemented method for identifying links in an electronic file
that is expressed as a data structure having components and base links.
The base links define a structural relationship between the components.
The method of the invention traverses the data structure using the base
links and produces a virtual link between components in the data
structure by recognizing a characteristic shared by the components.

[0009] The virtual link is identified when needed at run-time. A
function, such as a renumbering function or a function that generates
text, may be performed using each component that is virtually linked to
another component.

[0010] A plurality of traversal routines can sequentially execute to
identify a virtual link between components. The data structure can be
hierarchical and the traversal path used by the traversal routines can
be expressed in terms of family, next, and previous structural
relationships.

[0011] Among the advantages of the invention are one or more of the
following. The invention only requires one set of base links.
Eliminating all other links between components (e.g., direct links)
eliminates the need to regenerate those other links when the structure
is altered. Furthermore, memory requirements are reduced because
multiple sets of links are not stored.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] The foregoing features and other aspects of the invention will
become more apparent from the drawings taken together with the
accompanying description, in which:

[0013] FIG. 1 is a block diagram of a computer platform suitable for
supporting virtual navigators in accordance with the invention.

[0014] FIG. 2 is a diagram of a hierarchy of components in an
electronic document.

[0015] FIG. 3A is a diagram showing base links and 
virtual links. 

[0016] FIG. 3B is a diagram showing base links and
virtual links.

[0017] FIG. 4 is a flow chart of the context in which a
virtual navigator is used.

[0018] FIG. 5 is an illustration of the cascading virtual
navigators.

[0019] FIG. 6 is a flow chart of the ordered-list virtual 
navigator. 
DETAILED DESCRIPTION

[0020] Referring now to FIG. 1, a computer platform 100 suitable for
supporting an electronic document publishing system 101 is shown. The
electronic document publishing system 101 includes one or more virtual
navigators 102 on disk or in main memory. The computer platform 100
includes a digital computer 104, a display 106, a keyboard 108, a mouse
or other pointing device 110, and a mass storage device 112 (e.g., hard
disk drive, magneto-optical disk drive, or floppy disk drive). The
computer 104 includes memory 120, a processor 122, and other customary
components, such as a memory bus and peripheral bus (not shown).
[0021] An electronic document 130 contains information stored on a hard
disk or other computer-readable medium such as a diskette. A
human-perceptible version of the electronic document 132 is viewable on
the computer display 106 or as a hardcopy printout obtained through
operation on the electronic document by a computer program.

[0022] Referring now to FIG. 2, a group of components 201-206 organized
as a hierarchical data structure 200 is shown. The data structure 200
represents an electronic document. The components may be section
headings, paragraphs, list items, and so forth. For example, component
202 and component 205 may be paragraphs, components 203 and 206 may be
footnotes, and component 204 may be an index entry.
[0023] The electronic document publishing system 101 uses base links to
identify the interrelationship of all of the components in the
hierarchical structure. Solid lines 250-256 between nodes 201-206 in
FIG. 2 depict the familial, next, and previous links of the data
structure 200. The familial links and the next and previous component
links may be specified and stored as attribute/value pairs with each
component. For example, an attribute may be a parent link or a child
link and a value may be a pointer to a parent node or child node.
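
As a minimal sketch of the attribute/value storage just described, each component below keeps a dictionary whose keys are link attributes and whose values are references to other components. The class name `Component`, the attribute names, and the helper `append_child` are hypothetical, and the small tree built at the end is invented for the example; it is not the structure of FIG. 2.

```python
from typing import Dict, Optional


class Component:
    """A component whose base links are attribute/value pairs (sketch)."""

    # Hypothetical attribute names for the base links.
    LINK_ATTRIBUTES = ("parent", "first_child", "last_child",
                       "next_child", "prev_child")

    def __init__(self, kind: str) -> None:
        self.kind = kind
        # attribute name -> reference to another component, or None
        self.links: Dict[str, Optional["Component"]] = {
            name: None for name in self.LINK_ATTRIBUTES
        }


def append_child(parent: Component, child: Component) -> None:
    """Update only the base links affected by adding `child` to `parent`."""
    child.links["parent"] = parent
    last = parent.links["last_child"]
    if last is None:
        parent.links["first_child"] = child
    else:
        last.links["next_child"] = child
        child.links["prev_child"] = last
    parent.links["last_child"] = child


paragraph = Component("paragraph")
footnote = Component("footnote")
index_entry = Component("index entry")
append_child(paragraph, footnote)
append_child(paragraph, index_entry)

print(footnote.links["next_child"].kind)
```

Because only this one set of links is stored, a structural edit touches a handful of attribute values and nothing else needs to be regenerated.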
[0024] Rather than storing and maintaining additional links, such as
direct links, the system 101 uses virtual navigators 102 (FIG. 1) to
locate specific components in the data structure. A virtual navigator
is a software routine. As the name implies, a virtual navigator
identifies an apparent path between components by traversing the data
structure through the base links.
[0025] Shown in FIG. 3A are apparent path 357 and apparent path 358
between footnote component 203, index component 204, and footnote
component 206. Footnotes 203 and 206 and index component 204 share the
characteristic that they are anchored to another component, such as a
paragraph, and are each a type of anchor component. An anchor virtual
navigator produces the virtual link 357 between footnote component 203
and index component 204 by using base link 255, and produces the
virtual link 358 between index component 204 and footnote component 206
by using base link 254, base link 252, and base link 256.



[0026] Shown in FIG. 3B is a virtual link 359. Virtual link 359 is
derived from virtual link 357 and virtual link 358. The footnote
virtual navigator produced virtual link 359 using virtual link 357 and
virtual link 358, which the anchor virtual navigator produced.

[0027] The electronic document publishing system 101 provides a virtual
navigator for each type of component that needs to be identified.
Examples of virtual navigators 102 include a footnote virtual navigator
that locates all footnotes, an ordered-list virtual navigator that
locates all ordered lists, a numbered paragraph virtual navigator that
locates all numbered paragraphs, and a paragraph virtual navigator that
locates all paragraphs.
[0028] In an object-oriented environment, a base virtual navigator
class is the class from which all other virtual navigator classes are
derived and thus all other virtual navigator classes inherit features
from the base virtual navigator class. Each type of virtual navigator
102 is defined by its own class and each virtual navigator 102 is an
object instantiated from that class. All virtual navigators 102 can
inherit and use functions defined for any ancestral virtual navigator
classes.
[0029] Each virtual navigator 102 uses the base links of the
hierarchical data structure or virtual links provided by other virtual
navigators and identifies a set of components by recognizing common
characteristics shared by the set of components. The virtual navigators
102 need not construct or store a data structure on a computer medium
or in a computer memory after identifying a set of components. A chain
of components is discovered dynamically and each component is used for
a specific function at the time the component is discovered before the
virtual navigator searches for another component in the chain.
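
The dynamic discovery described above maps naturally onto a lazy iterator. The following sketch uses a Python generator as a stand-in for a virtual navigator; this is a substitution chosen for the illustration, not the mechanism recited in the specification. The names `Node`, `document_order`, and `navigate` are assumptions, and the `log` list exists only to show that each component is used before the search resumes.

```python
from typing import Callable, Iterator, List, Optional, Sequence


class Node:
    """Document component with familial base links (hypothetical class)."""

    def __init__(self, kind: str, label: str,
                 children: Sequence["Node"] = ()) -> None:
        self.kind = kind
        self.label = label
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = list(children)
        for child in self.children:
            child.parent = self


def document_order(root: Node) -> Iterator[Node]:
    """Walk the tree depth-first, following only the base links."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children))


def navigate(root: Node,
             has_characteristic: Callable[[Node], bool]) -> Iterator[Node]:
    """Yield matching components one at a time; no chain is stored."""
    for node in document_order(root):
        if has_characteristic(node):
            yield node


log: List[str] = []


def is_footnote(node: Node) -> bool:
    log.append(f"test {node.label}")   # records when the search examines a node
    return node.kind == "footnote"


doc = Node("document", "doc", [
    Node("paragraph", "p1", [Node("footnote", "f1")]),
    Node("paragraph", "p2", [Node("index", "i1"), Node("footnote", "f2")]),
])

for footnote in navigate(doc, is_footnote):
    log.append(f"use {footnote.label}")  # the caller's function runs here

print(log)
```

In the resulting log, the entry "use f1" appears before "test p2", which reflects the behavior described: the component is consumed at the time it is discovered, and only then does the search continue.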

[0030] A virtual navigator may be used when an author adds, deletes,
moves, or modifies in some way, one or more components in the data
structure 200. If the modification affects the way in which other
components are numbered, a renumbering routine may be called to
renumber affected paragraphs. That routine may use a numbered paragraph
virtual navigator, a footnote virtual navigator, or both, to identify
components that need renumbering.

[0031] As an example, a virtual navigator may be called when a new
section heading is inserted between existing section headings in an
electronic document. Thus, if a new section heading is inserted between
Section 2.0 and Section 3.0, the virtual navigator identifies all
numbered section headings from Section 3.0 through the end of the
electronic document. When a section heading is identified, a routine,
such as the routine that called the virtual navigator, renumbers the
heading. Section 3.0 will become 4.0, Section 3.1 will become 4.1, and
so on.
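
The renumbering example can be worked through in a short sketch. Here a Python list stands in for the next and previous base links, and each heading is renumbered as soon as it is found. The names `Component`, `next_heading`, and `insert_heading`, and the representation of a section number as a major/minor pair, are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    kind: str        # "heading" or "paragraph" in this sketch
    major: int = 0   # e.g. the 3 in "Section 3.1"
    minor: int = 0   # e.g. the 1 in "Section 3.1"


def next_heading(components: List[Component], start: int) -> Optional[int]:
    """Return the index of the first heading at or after `start`, or None."""
    for i in range(start, len(components)):
        if components[i].kind == "heading":
            return i
    return None


def insert_heading(components: List[Component], position: int,
                   major: int) -> None:
    """Insert a new top-level heading, then renumber later headings."""
    components.insert(position, Component("heading", major, 0))
    i = next_heading(components, position + 1)
    while i is not None:
        components[i].major += 1   # renumber at the time of discovery
        i = next_heading(components, i + 1)


doc = [
    Component("heading", 1, 0), Component("paragraph"),
    Component("heading", 2, 0), Component("paragraph"),
    Component("heading", 3, 0), Component("heading", 3, 1),
    Component("paragraph"),
]

# Insert a new heading between Section 2.0 and Section 3.0.
insert_heading(doc, 4, 3)
numbers = [f"{c.major}.{c.minor}" for c in doc if c.kind == "heading"]
print(numbers)
```

After the insertion the new heading holds 3.0, the former Section 3.0 has become 4.0, and the former Section 3.1 has become 4.1.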

[0032] The virtual navigators 102 use protocols based on traversal
methods that obtain the parent, next child, previous child, first
child, last child, and next and previous components. Each virtual
navigator 102 implements at least one traversal routine tailored to a
specific type of component and considers the linkage requirements for
the component type. For example, a numbered paragraph virtual navigator
has three traversal routines, "GetParent", "GetNext", and "GetPrev",
that recognize a numbered paragraph component. A paragraph virtual
navigator has traversal routines "GetParent", "GetNext", "GetPrev",
"GetNextChild", "GetPrevChild", "GetFirstChild", and "GetLastChild"
that recognize paragraph components.
[0033] FIG. 4 illustrates the use of a virtual navigator. First, the
electronic document is stored as a hierarchical data structure 200
having base structural links (step 410). When the electronic document
publishing system 101 needs to perform a task on particular components,
a virtual navigator is called to identify the components. A link
between identified components is not stored, so the virtual navigator
produces a virtual link between components as the components are
identified (step 420). The virtual navigator derives the virtual link
by calling other virtual navigators. The virtual navigators use the
base links, which may simply be pointers from one component to another,
of the hierarchical data structure to identify the particular set of
components.
[0034] To derive a virtual link, a virtual navigator identifies a
component having a specific characteristic (step 460), which will be
discussed. The routine that called the virtual navigator may perform an
operation using the identified component (step 470). After the
operation is performed, the virtual navigator may be called again to
search for another component having the specified characteristic. The
cycle of calling the virtual navigator and performing a function is
repeated until the calling routine determines that all components were
identified. For example, the calling routine may need the entire
hierarchical data structure traversed or only need to identify
components in a specific branch.
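
The call cycle of steps 460 and 470, including the case in which the calling routine stops at the end of a specific branch, might look like the following sketch. All names (`Node`, `next_in_document`, `get_next`, `in_branch`, `process_branch`) are hypothetical, and the test of a component's `kind` string is only one assumed way to recognize a characteristic.

```python
from typing import Callable, List, Optional


class Node:
    """Document component with familial base links (hypothetical class)."""

    def __init__(self, kind: str, label: str) -> None:
        self.kind = kind
        self.label = label
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def next_in_document(node: Node) -> Optional[Node]:
    """Follow base links to the next component in document order."""
    if node.children:
        return node.children[0]
    while node.parent is not None:
        siblings = node.parent.children
        i = siblings.index(node)
        if i + 1 < len(siblings):
            return siblings[i + 1]
        node = node.parent
    return None


def get_next(current: Node, kind: str) -> Optional[Node]:
    """Step 460: find the next component with the given characteristic."""
    node = next_in_document(current)
    while node is not None and node.kind != kind:
        node = next_in_document(node)
    return node


def in_branch(node: Optional[Node], branch_root: Node) -> bool:
    """Return True if `node` is `branch_root` or one of its descendants."""
    while node is not None:
        if node is branch_root:
            return True
        node = node.parent
    return False


def process_branch(branch_root: Node, kind: str,
                   function: Callable[[Node], None]) -> None:
    """Alternate steps 460 and 470 until the branch is exhausted."""
    node = get_next(branch_root, kind)
    while node is not None and in_branch(node, branch_root):
        function(node)               # step 470: use the component now
        node = get_next(node, kind)  # step 460 again


doc = Node("document", "doc")
s1 = doc.add(Node("section", "s1"))
s1.add(Node("paragraph", "p1")).add(Node("footnote", "f1"))
s1.add(Node("paragraph", "p2")).add(Node("footnote", "f2"))
s2 = doc.add(Node("section", "s2"))
s2.add(Node("paragraph", "p3")).add(Node("footnote", "f3"))

seen_in_s1: List[str] = []
process_branch(s1, "footnote", lambda n: seen_in_s1.append(n.label))
seen_in_doc: List[str] = []
process_branch(doc, "footnote", lambda n: seen_in_doc.append(n.label))
print(seen_in_s1, seen_in_doc)
```

The stopping condition lives in the calling routine rather than in the navigator, which matches the description that the caller decides when all components of interest have been identified.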
[0035] Due to similar linkage requirements, virtual navigators 102 call
other virtual navigators that identify other types of components.
Together, the virtual navigators can traverse the entire hierarchical
data structure via the base links. For example, an ordered-list
component requires a numbered paragraph component to be its parent
component, and a numbered paragraph component requires a paragraph
component to be its parent. In this case, an ordered-list virtual
navigator calls a numbered paragraph virtual navigator, and the
numbered paragraph virtual navigator calls the paragraph virtual
navigator.

[0036] Shown in FIG. 5 is a conceptual representation of three virtual
navigators interacting with one another to identify ordered-list
components using GetNext traversal routines. An ordered-list class is
derived from a numbered paragraph class and a numbered paragraph class
is a class derived from a paragraph class. The ordered-list virtual
navigator obtains the next ordered-list component (step 460') by
sequentially obtaining the next numbered paragraph until an
ordered-list component is found (step 520). To obtain the next numbered
paragraph, the numbered paragraph virtual navigator sequentially gets
the next paragraph until a numbered paragraph is found (step 530). This
cascading effect can continue up to the virtual navigator that
identifies a component in the class from which all component classes
are derived.

[0037] Referring to FIG. 6, shown is an illustrative example of the
ordered-list virtual navigator's GetNext traversal routine 460'
identifying components that are ordered lists. The ordered-list virtual
navigator's GetNext routine 460' begins by getting the next numbered
paragraph in the structure: it calls the numbered paragraph virtual
navigator's GetNext routine 520 (step 521). The ordered-list virtual
navigator tests whether a numbered paragraph was returned (step 522)
and whether the numbered paragraph is an ordered-list component (step
524). If an ordered-list component was returned, the ordered-list
virtual navigator returns (step 526) and the calling routine can
perform a prescribed function using the ordered-list component. For
example, the function may increment a section number. If a numbered
paragraph was returned, but it was not an ordered-list component, the
ordered-list virtual navigator continues to search for an ordered-list
component. If a numbered paragraph was not returned, the entire
structure was traversed and the ordered-list virtual navigator returns
to the calling routine (step 526).

[0038] Getting the next numbered paragraph follows a similar technique.
To obtain the next numbered paragraph, the numbered paragraph virtual
navigator's GetNext traversal routine 520 calls the paragraph virtual
navigator's GetNext traversal routine 530 (step 531), tests whether a
paragraph was returned (step 532), and if a paragraph was returned,
tests whether the paragraph is a numbered paragraph (step 534). If the
paragraph was not a numbered paragraph, the numbered paragraph virtual
navigator's GetNext routine 520 repeats steps 531-534 until a numbered
paragraph is returned or the numbered paragraph virtual navigator has
traversed the structure.

[0039] To get the next paragraph, the paragraph virtual navigator
obtains the next component because a paragraph is derived from a
component. The paragraph virtual navigator's GetNext routine 530 is
called to obtain the next component. The paragraph virtual navigator's
GetNext routine calls the component virtual navigator's GetNext
traversal routine (step 541) and tests whether a component was returned
(step 542), and if so, whether the component is a paragraph (step 544).
If the component was not a paragraph, the paragraph virtual navigator's
GetNext routine repeats steps 541-544 until a paragraph is returned or
the paragraph virtual navigator has traversed the structure.
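
The cascade of GetNext routines described above can be sketched as a chain of navigators, each delegating to the navigator for the parent class and keeping only the results that belong to its own class. The class names, the use of `isinstance` as the recognition test, and the use of composition through a single `FilteringNavigator` helper (rather than one derived navigator class per component type) are all simplifying assumptions of this sketch. The step numbers in the comments refer to the flow described in the text.

```python
from typing import List, Optional, Type


class Component:
    def __init__(self, label: str) -> None:
        self.label = label
        self.next: Optional["Component"] = None   # 'next' base link


class Paragraph(Component):
    pass


class NumberedParagraph(Paragraph):
    pass


class OrderedList(NumberedParagraph):
    pass


class Footnote(Component):
    pass


class ComponentNavigator:
    """Base navigator: simply follows the 'next' base link."""

    def get_next(self, current: Component) -> Optional[Component]:
        return current.next


class FilteringNavigator:
    """Hypothetical helper: asks an inner navigator, keeps only matches."""

    def __init__(self, inner, target: Type[Component]) -> None:
        self.inner = inner
        self.target = target

    def get_next(self, current: Component) -> Optional[Component]:
        node = self.inner.get_next(current)          # steps 521 / 531 / 541
        # Nothing returned (522 / 532 / 542) ends the search; otherwise
        # test the class of what was returned (524 / 534 / 544).
        while node is not None and not isinstance(node, self.target):
            node = self.inner.get_next(node)
        return node


def link(components: List[Component]) -> Component:
    """Chain components through 'next' links and return the first one."""
    for a, b in zip(components, components[1:]):
        a.next = b
    return components[0]


paragraph_nav = FilteringNavigator(ComponentNavigator(), Paragraph)
numbered_nav = FilteringNavigator(paragraph_nav, NumberedParagraph)
ordered_nav = FilteringNavigator(numbered_nav, OrderedList)

head = link([Paragraph("p1"), Footnote("f1"), NumberedParagraph("n1"),
             OrderedList("o1"), Paragraph("p2"), OrderedList("o2")])

found: List[str] = []
node = ordered_nav.get_next(head)
while node is not None:
    found.append(node.label)        # the caller's function would run here
    node = ordered_nav.get_next(node)

print(found)
```

Each navigator in the chain sees only candidates that the navigator beneath it has already accepted, so the ordered-list navigator never examines the footnote or the plain paragraph directly.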

[0040] Other embodiments are within the scope of the following claims.
Rather than one virtual navigator calling another virtual navigator, a
virtual navigator can include the functionality of several virtual
navigators. Additional object classes (e.g., containers), traversal
functions, and navigators may be implemented. Virtual navigators can
produce virtual links for linked data structures other than
hierarchical data structures. Other functions may be performed after a
component is identified, including generating bibliographies, endnotes,
tables of contents, and indices.



Claims

1. A computer-implemented method for identifying links in an
electronic file that is expressed as a data structure having a
plurality of components and base links that define a structural
relationship between the components, the method comprising:

traversing a data structure using a plurality of base links, and

producing a virtual link between a first component and a second
component in the data structure by recognizing a characteristic
shared by the first component and the second component.

2. The method of claim 1, wherein the virtual link is
identified when needed at run-time.

3. The method of claim 1, further comprising:

performing a function using the second component before the
traversal of the data structure is completed.

4. The method of claim 1, further comprising:

providing a plurality of traversal routines that sequentially
execute to identify a virtual link between components.

5. The method of claim 1, wherein the second component inherits
features from a component class, and a traversal routine recognizes
the second component by recognizing members of the component class
until the second component is found.

6. The method of claim 5, wherein the data structure is a
hierarchical data structure and the traversal routine specifies a
traversal path in terms of family, next, and previous structural
relationships.

7. The method of claim 3, wherein the electronic file is
an electronic document.

8. The method of claim 7, wherein the function performed on the second
component is a renumbering function.

9. The method of claim 7, wherein the function performed on the second
component generates text.

10. The method of claim 7, wherein the function performed on the
second component locates a text string.

11. The method of claim 7, wherein the traversal routine identifies a
plurality of virtual links between components.

12. The method of claim 11, wherein the data structure is a
hierarchical data structure, the virtual links represent a
hierarchical subset of components in the hierarchical data structure,
the traversal routine specifies a traversal path in terms of family,
next, or previous structural relationships, and the traversal routine
specifies components according to data type.

13. A computer-implemented method for identifying links in an
electronic file at run time, comprising:

providing an electronic file as a hierarchical data structure having
a plurality of components and a plurality of base links that define a
structural relationship between the components;

traversing the hierarchical data structure using a plurality of
traversal routines that use the base links;

defining the traversal routines as classes that inherit features from
other traversal routine classes;

having each traversal routine identify a plurality of links between a
plurality of components in the hierarchical data structure by
recognizing a characteristic shared by the components; and

performing a function using each identified component at the time the
component is identified.

14. A computer program operating on an electronic file arranged as a
data structure having a plurality of components and a plurality of
base links that define a structural relationship between the
components, the computer program residing on a computer-readable
medium, comprising instructions causing a computer to:

provide at least one traversal routine, with the traversal routine
identifying a link between a first component and a second component
in a data structure by traversing the data structure using the base
links.

15. The computer program of claim 14, wherein the second component
inherits characteristics from a class of components and the traversal
routine identifies the link by recognizing members of the class of
components.
16. The computer program of claim 14, further comprising the
instruction causing a computer to:

perform a function using the second component at the time the
component is identified.

17. The computer program of claim 14, wherein the 
electronic file is an electronic document. 



18. The computer program of claim 17, further comprising the
instruction causing a computer to:

produce a plurality of links between a plurality of components in the
data structure.

19. The computer program of claim 17, further comprising the
instruction causing a computer to:

perform a function using the second component and subsequently linked
components before traversal of the data structure is completed.

20. The computer program of claim 19, wherein the
function performed is a renumbering function.

21. The computer program of claim 19, wherein the
function performed generates text.
